1. Introduction

An increasing number of stocks are traded in electronic, order-driven markets, in which orders to
buy and sell are centralized in a limit order book available to market participants and market
orders are executed against the best available offers in the limit order book. The dynamics of prices
in such markets are not only interesting from the viewpoint of market participants, for trading
and order execution (Alfonsi et al. (2010), Predoiu et al. (2011)), but also from a fundamental
perspective, since they provide a rare glimpse into the dynamics of supply and demand and their
role in price formation.



Equilibrium models of price formation in limit order markets (Parlour (1998), Rosu (2009)) have
shown that the evolution of the price in such markets is rather complex and depends on the state
of the order book. On the other hand, empirical studies on limit order books (Bouchaud et al.
(2008), Farmer et al. (2004), Gourieroux et al. (1999), Hollifield et al. (2004), Smith et al. (2003))
provide an extensive list of statistical features of order book dynamics that are challenging to
incorporate in a single model. While most of these studies have focused on unconditional/steady-state
distributions of various features of the order book, empirical studies (see e.g. Harris and
Panchapagesan (2005)) show that the state of the order book contains information on short-term
price movements, so it is of interest to provide forecasts of various quantities conditional on the
state of the order book. Providing analytically tractable models which make it possible to compute
and/or reproduce conditional quantities which are relevant for trading and intraday risk management
has proven to be challenging, given the complex relation between order book dynamics and price
behavior.

The search for tractable models of limit order markets has led to the development of stochastic 
models which aim to retain the main statistical features of limit order books while remaining 
computationally manageable. Stochastic models also serve to illustrate how far one can go in 
reproducing the dynamic properties of a limit order book without resorting to detailed behavioral 
assumptions about market participants or introducing unobservable parameters describing agent 
preferences, as in more detailed market microstructure models. 

Starting from a description of order arrivals and cancelations as point processes, the dynamics
of a limit order book is naturally described in the language of queueing theory. Engle and Lunde
(2003) formulate a bivariate point process to jointly analyze trade and quote arrivals. Cont et al.
(2010b) model the dynamics of a limit order book as a tractable multiclass queueing system and
compute various transition probabilities of the price conditional on the state of the order book,
using Laplace transform methods.

1.1. Summary 

We propose in this work a Markovian model of a limit order market, which captures some salient
features of the dynamics of market orders and limit orders, yet is even simpler than the model of
Cont et al. (2010b) and enables a wide range of properties of the price process to be computed
analytically.

Our approach is motivated by the observation that, if one is primarily interested in the dynamics
of the price, it is sufficient to focus on the dynamics of the (best) bid and ask queues. Indeed,
empirical evidence shows that most of the order flow is directed at the best bid and ask prices
(Biais et al. (1995)) and the imbalance between the order flow at the bid and at the ask appears
to be the main driver of price changes (Cont et al. (2010a)).

Motivated by this remark, we propose a parsimonious model in which the limit order book is
represented by the number of limit orders $(q^b_t, q^a_t)$ sitting at the bid and the ask, represented as a
system of two interacting queues. The remaining levels of the order book are treated as a 'reservoir'
of limit orders represented by the distribution of the size of the queues at the 'next-to-best' price
levels. Through its analytical tractability, the Markovian version of our model allows us to obtain
analytical expressions for various quantities of interest such as the distribution of the duration until
the next price change, the distribution and autocorrelation of price changes, and the probability
of an upward move in the price, conditional on the state of the order book.



Compared with econometric models of high frequency data (Engle and Russell (1998), Engle
and Lunde (2003)), where the link between durations and price changes is specified exogenously,
our model links these quantities in an endogenous manner, and provides a first step towards joint
'structural' modeling of high frequency dynamics of prices and order flow.

A second important observation is that order arrivals and cancelations are very frequent and
occur at millisecond time scale, whereas, in many applications such as order execution, the metric of
success is the volume-weighted average price (VWAP), so one is interested in the dynamics of order
flow over a large time scale, typically tens of seconds or minutes. As shown in Table 1, thousands
of order book events may occur over such time scales. This aggregation of events actually simplifies
much of the analysis and enables us to use asymptotic methods. We study the link between price
volatility and order flow in this model by studying the diffusion limit of the price process. In
particular, we express the volatility of price changes in terms of parameters describing the arrival
rates of buy and sell orders and cancelations. These analytical results provide some insight into
the relation between order flow and price dynamics in order-driven markets. Comparison of these
asymptotic results with empirical data shows the main insights of the model to be correct: in
particular, we show that in limit order markets where orders arrive frequently, the volatility of
price changes increases with the ratio of the order arrival intensity to market depth, as predicted
by our model.





                    Average no. of     Price changes
                    orders in 10s      in 1 day
Citigroup           4469               12499
General Electric    2356               7862
General Motors      1275               9016

Table 1 Average number of orders in 10 seconds and number of price changes (June 26th, 2008).



1.2. Outline 

The paper is organized as follows. Section 2 introduces a reduced-form representation of a limit
order book and presents a Markovian model in which limit orders, market orders and cancellations
occur according to Poisson processes. Section 3 presents various analytical results for this model: we
compute the distribution of the duration until the next price change (Section 3.1), the probability
of an upward move in the price (Section 3.2) and the dynamics of the price (Section 3.3). In Section
4 we show that the price behaves, at longer time scales, as a Brownian motion whose variance is
expressed in terms of the parameters describing the order flow, thus establishing a link between
volatility and order flow statistics.



2. A Markov model of limit order book dynamics 
2.1. Level-1 representation of a limit order book 

Empirical studies of limit order markets suggest that the major component of the order flow occurs
at the (best) bid and ask price levels (see e.g. Biais et al. (1995)). Furthermore, studies on the price
impact of order book events show that the net effect of orders on the bid and ask queue sizes is
the main factor driving price variations (Cont et al. (2010a)). These observations, together with
the fact that queue sizes at the best bid and ask ("Level I" order book) are more easily obtainable
(from trades and best quotes) than Level II data, motivate a reduced-form modeling approach in
which we represent the state of the limit order book by

• the bid price $s^b_t$ and the ask price $s^a_t$,

• the size of the bid queue $q^b_t$ representing the outstanding limit buy orders at the bid, and

• the size of the ask queue $q^a_t$ representing the outstanding limit sell orders at the ask.

Figure 1 summarizes this representation.



Figure 1 Simplified representation of a limit order book (axes: price, quantities).



The bid and ask prices are multiples of the tick size $\delta$. As shown in Table 2, for liquid stocks
the bid-ask spread $s^a_t - s^b_t$ is equal to one tick for more than 98% of observations. We will therefore
make the simplifying assumption that the spread is equal to one tick, i.e. $s^a_t = s^b_t + \delta$, resulting in
a further reduction of dimension in the model.



Bid-ask spread      1 tick    2 ticks   ≥ 3 ticks
Citigroup           98.82     1.18      0
General Electric    98.80     1.18      0.02
General Motors      98.71     1.15      0.14

Table 2 Percentage of observations with a given bid-ask spread (June 26th, 2008).



The state of the limit order book is thus described by the triplet $X_t = (s^b_t, q^b_t, q^a_t)$ which takes
values in the discrete state space $\delta\mathbb{Z} \times \mathbb{N}^2$.



2.2. Order book dynamics 

The state $X_t$ of the order book is modified by order book events: limit orders (at the bid or ask),
market orders and cancelations (see Cont et al. (2010a,b), Smith et al. (2003)). A limit buy (resp.
sell) order of size x increases the size of the bid (resp. ask) queue by x, while a market buy (resp.
sell) order decreases the corresponding queue size by x. Cancellation of x orders in a given queue
reduces the queue size by x. Given that we are interested in the queue sizes at the best bid/ask
levels, market orders and cancellations have the same effect on the state variable $X_t$.
We will assume that these events occur according to independent Poisson processes:
• Market buy (resp. sell) orders arrive at independent, exponential times with rate $\mu$,

• Limit buy (resp. sell) orders at the (best) bid (resp. ask) arrive at independent, exponential
times with rate $\lambda$,

• Cancellations occur at independent, exponential times with rate $\theta$.

• These events are mutually independent.

• All order sizes are equal (assumed to be 1 without loss of generality).

Denoting by $(T^a_i, i \geq 1)$ (resp. $T^b_i$) the times at which the size of the ask (resp. the bid) queue changes
and $V^a_i$ (resp. $V^b_i$) the size of the associated change in queue size, the above assumptions translate
into the following properties for the sequences $T^a_i, T^b_i, V^a_i, V^b_i$:

(i) $(T^a_{i+1} - T^a_i)_{i \geq 0}$ is a sequence of independent random variables with exponential distribution
with parameter $\lambda + \theta + \mu$,

(ii) $(T^b_{i+1} - T^b_i)_{i \geq 0}$ is a sequence of independent random variables with exponential distribution
with parameter $\lambda + \theta + \mu$,

(iii) $(V^a_i)_{i \geq 0}$ is a sequence of independent random variables with

$P[V^a_i = 1] = \frac{\lambda}{\lambda + \mu + \theta}$ and $P[V^a_i = -1] = \frac{\mu + \theta}{\lambda + \mu + \theta}$, (1)

(iv) $(V^b_i)_{i \geq 0}$ is a sequence of independent random variables with

$P[V^b_i = 1] = \frac{\lambda}{\lambda + \mu + \theta}$ and $P[V^b_i = -1] = \frac{\mu + \theta}{\lambda + \mu + \theta}$. (2)

• All the previous sequences are independent.
Once the bid (resp. the ask) queue is depleted, the price will move to the queue at the next level,
which we assume to be one tick below (resp. above). The new queue size then corresponds to what
was previously the number of orders sitting at the price immediately below (resp. above) the best
bid (resp. ask). Instead of keeping track of these queues (and the corresponding order flow) at
all price levels (as in Cont et al. (2010b), Smith et al. (2003)), we treat these sizes as stationary
variables drawn from a certain distribution $f$ on $\mathbb{N}^2$. Here $f(x,y)$ represents the probability of
observing $(q^b_t, q^a_t) = (x,y)$ right after a price increase. Similarly, we denote by $\tilde f(x,y)$ the probability
of observing $(q^b_t, q^a_t) = (x,y)$ right after a price decrease. More precisely, denoting by $\mathcal{F}_t$ the history
of prices and order book events on $[0,t]$,

• if $q^a_{t-} = 0$ then $(q^b_t, q^a_t)$ is a random variable with distribution $f$, independent from $\mathcal{F}_{t-}$.

• if $q^b_{t-} = 0$ then $(q^b_t, q^a_t)$ is a random variable with distribution $\tilde f$, independent from $\mathcal{F}_{t-}$.

Given the independence assumptions on event types, the probability that these two situations occur
simultaneously is zero.

Remark 1. The assumption that $(q^b_t, q^a_t)$ is independent from $\mathcal{F}_{t-}$ is not necessary. If one only
assumes that the random variables used to replace the quantity of orders once the price moves are
stationary, all the results from this paper remain valid. However, without this assumption, the
process $(q^b_t, q^a_t)_{t \geq 0}$ becomes non-Markovian.

The distributions $f$ and $\tilde f$ summarize the interaction of the queues at the best bid/ask levels with
the rest of the order book, viewed here as a 'reservoir' of limit orders. For simplicity we shall
assume $\tilde f(x,y) = f(y,x)$, i.e. events occurring on the bid and on the ask side have similar statistical
properties, but our analysis may be readily extended to the asymmetric case. Figure 2 shows the
(joint) empirical distribution of bid and ask queue sizes after a price move for Citigroup stock on
June 26th 2008.

Figure 2 Joint density of bid and ask queue sizes after a price move (Citigroup, June 26th 2008).

Under these assumptions $q_t = (q^b_t, q^a_t)$ is thus a Markov process, taking values in $\mathbb{N}^2$, whose
transitions correspond to the order book events $\{T^a_i, i \geq 1\} \cup \{T^b_i, i \geq 1\}$:

• At the arrival of a new limit buy (resp. sell) order the bid (resp. ask) queue increases by one
unit. This occurs at rate $\lambda$.

• At each cancellation or market order, which occurs at rate $\theta + \mu$, either:

(a) the corresponding queue decreases by one unit if it is > 1, or

(b) if the ask queue is depleted then $q_t$ is a random variable with distribution $f$.

(c) if the bid queue is depleted then $q_t$ is a random variable with distribution $\tilde f$.
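
The transition rules above can be sketched as a small event-driven simulation. This is only an illustration under stated assumptions, not the estimation or simulation code of the paper: the names `simulate_price` and `draw_queues` are hypothetical, `mu_theta` stands for the combined rate $\mu + \theta$, and the uniform draw used for $f$ is a placeholder for an empirically estimated distribution. After a price decrease the draw is swapped, which corresponds to the symmetric case $\tilde f(x,y) = f(y,x)$.

```python
import random

def draw_queues(rng):
    """Hypothetical stand-in for f: (bid, ask) queue sizes right after a price increase."""
    return rng.randint(1, 10), rng.randint(1, 10)

def simulate_price(lam, mu_theta, q_bid, q_ask, horizon, seed=0):
    """Simulate the two-queue model up to `horizon`; return a list of (time, direction).

    lam      -- arrival rate of limit orders on each side
    mu_theta -- combined rate mu + theta of market orders and cancellations on each side
    """
    rng = random.Random(seed)
    t, moves = 0.0, []
    # Both queues are always non-empty, so events arrive at a constant total rate.
    total_rate = 2.0 * (lam + mu_theta)
    while True:
        t += rng.expovariate(total_rate)
        if t > horizon:
            return moves
        on_bid = rng.random() < 0.5
        step = 1 if rng.random() < lam / (lam + mu_theta) else -1
        if on_bid:
            q_bid += step
            if q_bid == 0:                 # bid depleted: price moves down one tick
                moves.append((t, -1))
                q_ask, q_bid = draw_queues(rng)   # swapped draw, i.e. f~(x, y) = f(y, x)
        else:
            q_ask += step
            if q_ask == 0:                 # ask depleted: price moves up one tick
                moves.append((t, +1))
                q_bid, q_ask = draw_queues(rng)

moves = simulate_price(lam=12.0, mu_theta=13.0, q_bid=5, q_ask=4, horizon=20.0, seed=1)
```

Each entry of `moves` records the time of a price change and its sign; cumulating the signs (times the tick size) gives a sample path of the price.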

The values of $\lambda$ and $\mu + \theta$ are readily estimated from high-frequency records of order books (see
Cont et al. (2010b) for a description of the estimation procedure). Table 3 gives examples of such
parameter estimates for the stocks mentioned above. We note that in all cases $\lambda < \mu + \theta$ but that
the difference is small: $|(\mu + \theta) - \lambda| \ll \lambda$.





                    $\lambda$     $\mu + \theta$
Citigroup           2204    2331
General Electric    317     325
General Motors      102     104

Table 3 Estimates for the intensity of limit orders and market orders+cancellations, in number of batches per
second (each batch representing 100 shares), on June 26th, 2008.
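
As a quick arithmetic check of the statement that the difference between the two intensities is small, the relative imbalance $((\mu+\theta) - \lambda)/\lambda$ can be computed from the Table 3 estimates. The helper name `relative_imbalance` is introduced here for illustration only.

```python
# Table 3 estimates: (lambda, mu + theta), in batches per second.
estimates = {
    "Citigroup": (2204, 2331),
    "General Electric": (317, 325),
    "General Motors": (102, 104),
}

def relative_imbalance(lam, mu_theta):
    """Return ((mu + theta) - lambda) / lambda."""
    return (mu_theta - lam) / lam

imbalances = {name: relative_imbalance(*rates) for name, rates in estimates.items()}
for name, value in imbalances.items():
    print(f"{name}: {value:.3f}")
```

For all three stocks the relative imbalance is positive and below 6%, which is the sense in which the order flow is close to balanced.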



2.3. Price dynamics 

When the bid or ask queue is depleted, the price moves up or down to the next level of the order 
book. We will assume that the order book contains no 'gaps' (empty levels) so that these price 
increments are equal to one tick: 

• When the bid queue is depleted, the price decreases by one tick. 

• When the ask queue is depleted, the price increases by one tick. 

If there are gaps in the order book, this results in 'jumps' (i.e. variations of more than one tick) in
the price. The price process is thus a piecewise constant process whose transitions correspond
to hitting times of the axes $\{(0,y), y \in \mathbb{N}\} \cup \{(x,0), x \in \mathbb{N}\}$ by the Markov process $q_t = (q^b_t, q^a_t)$.
2.4. Summary

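In summary, the process $X_t = (s^b_t, q^b_t, q^a_t)$ is a continuous-time process with right-continuous,
piecewise constant sample paths whose transitions correspond to the order book events
$\{T^a_i, i \geq 1\} \cup \{T^b_i, i \geq 1\}$. At each event:

• If an order or cancelation arrives on the ask side i.e. $T \in \{T^a_i, i \geq 1\}$:

$(s^b_T, q^b_T, q^a_T) = (s^b_{T-}, q^b_{T-}, q^a_{T-} + V^a_i)\,1_{\{q^a_{T-} + V^a_i > 0\}} + (s^b_{T-} + \delta, R^b_i, R^a_i)\,1_{\{q^a_{T-} + V^a_i = 0\}}$

• If an order or cancelation arrives on the bid side i.e. $T \in \{T^b_i, i \geq 1\}$:

$(s^b_T, q^b_T, q^a_T) = (s^b_{T-}, q^b_{T-} + V^b_i, q^a_{T-})\,1_{\{q^b_{T-} + V^b_i > 0\}} + (s^b_{T-} - \delta, \tilde R^b_i, \tilde R^a_i)\,1_{\{q^b_{T-} + V^b_i = 0\}}$

where $(V^a_i)_{i \geq 1}$ and $(V^b_i)_{i \geq 1}$ are sequences of IID variables with distribution given by (1)-(2),
$(R_i)_{i \geq 1} = (R^b_i, R^a_i)_{i \geq 1}$ is a sequence of IID variables with (joint) distribution $f$, and
$(\tilde R_i)_{i \geq 1} = (\tilde R^b_i, \tilde R^a_i)_{i \geq 1}$ is a sequence of IID variables with (joint) distribution $\tilde f$.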

Remark 2 (Independence assumptions). The IID assumption for the sequences $(R_n)$, $(\tilde R_n)$ is
only used in Section 4. The results of Section 3 do not depend on this assumption.



2.5. Quantities of interest 

In applications, one is interested in computing various quantities that intervene in high frequency 
trading such as: 

• the conditional distribution of the duration between price moves, given the state of the order
book (Section 3.1),

• the probability of a price increase, given the state of the order book (Section 3.2),

• the dynamics of the price: distribution and autocorrelations of price changes (Section 3.3), and

• the volatility of the price (Section 4).

We will show that all these quantities may be characterized analytically in this model, in terms of 
order flow statistics. 



3. Analytical results 

The high-frequency dynamics of the price may be described in terms of durations between successive 
price changes and the magnitude of these price changes. Given that the state of the (Level I) order 
book is observable, it is of interest to examine what information the current state of the order 
book gives about the dynamics of the price. We now proceed to show how the model presented 
above may be used to compute the conditional distributions of durations and price changes, given 
the current state of the order book, in terms of the arrival rates of market orders, limit orders 
and cancellations. The results of this section do not depend on the assumptions on the sequences
$(R_n)$, $(\tilde R_n)$.

3.1. Duration until the next price change 

We consider first the distribution of the duration until the next price change, starting from a given
configuration $(b, a)$ of the order book. We define

• $\sigma^a$ the first time when the ask queue $(q^a_t, t \geq 0)$ is depleted,

• $\sigma^b$ the first time when the bid queue $(q^b_t, t \geq 0)$ is depleted.
Since the queue sizes are constant between events, one can express these stopping times as:

$\sigma^a = \inf\{T^a_i,\ q^a_{T^a_i-} + V^a_i = 0\}$, $\sigma^b = \inf\{T^b_i,\ q^b_{T^b_i-} + V^b_i = 0\}$.

The price $(s_t, t \geq 0)$ moves when the queue $q_t = (q^b_t, q^a_t)$ hits one of the axes: the duration until the
next price move is thus

$\tau = \sigma^a \wedge \sigma^b$.

The following proposition gives the distribution of the duration $\tau$, conditional on the initial queue
sizes:

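Proposition 1 (Distribution of duration until next price move). The distribution of $\tau$
conditioned on the state of the order book is given by:

$P[\tau > t \mid q^a_0 = a, q^b_0 = b] = \psi_{a,\lambda,\theta+\mu}(t)\,\psi_{b,\lambda,\theta+\mu}(t)$ (3)

where $\psi_{n,\lambda,\theta+\mu}(t) = \int_t^\infty \frac{n}{u} \left(\frac{\mu+\theta}{\lambda}\right)^{n/2} I_n\left(2\sqrt{\lambda(\theta+\mu)}\,u\right) e^{-u(\lambda+\theta+\mu)}\,du$ (4)

and $I_n$ is the modified Bessel function of the first kind. The conditional law of $\tau$ has a regularly
varying tail

• with tail exponent 2 if $\lambda < \mu + \theta$,

• with tail exponent 1 if $\lambda = \mu + \theta$. In particular, if $\lambda = \mu + \theta$, $E[\tau \mid q^a_0 = a, q^b_0 = b] = \infty$ whenever
$a > 0$, $b > 0$.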

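Proof. Since $(q^a_t, t \geq 0)$ follows a birth and death process with birth rate $\lambda$ and death rate $\mu + \theta$,
$\mathcal{L}(s,x) := E[e^{-s\sigma^a} \mid q^a_0 = x]$ satisfies:

$\mathcal{L}(s,x) = \frac{\lambda \mathcal{L}(s,x+1) + (\mu+\theta)\mathcal{L}(s,x-1)}{\lambda + \mu + \theta + s}$.

We can find the roots of the polynomial $\lambda X^2 - (\lambda + \mu + \theta + s)X + \mu + \theta$; one root is > 1, the other
is < 1; since $\mathcal{L}(s,0) = 1$ and $\lim_{x \to \infty} \mathcal{L}(s,x) = 0$,

$\mathcal{L}(s,x) = \left( \frac{(\lambda+\mu+\theta+s) - \sqrt{(\lambda+\mu+\theta+s)^2 - 4\lambda(\mu+\theta)}}{2\lambda} \right)^x$.

Moreover if we use the relation $P[\tau > t \mid q^a_0 = x, q^b_0 = y] = P[\sigma^a > t \mid q^a_0 = x]\,P[\sigma^b > t \mid q^b_0 = y]$,

$P[\tau > t \mid q^a_0 = x, q^b_0 = y] = \int_t^\infty \hat{\mathcal{L}}(u,x)\,du \int_t^\infty \hat{\mathcal{L}}(u,y)\,du$,

where $\hat{\mathcal{L}}(\cdot,x)$ denotes the density whose Laplace transform is $\mathcal{L}(\cdot,x)$. This Laplace transform may
be inverted (see Feller 1971, XIV.7) and the inversion yields

$\hat{\mathcal{L}}(t,x) = \frac{x}{t} \left(\frac{\mu+\theta}{\lambda}\right)^{x/2} I_x\left(2\sqrt{\lambda(\theta+\mu)}\,t\right) e^{-t(\lambda+\theta+\mu)}$,

which gives us the expected result.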
Figure 3 Above: $P(\tau > t \mid q^a_0 = 4, q^b_0 = 5)$ as a function of t (seconds) for $\lambda = 12$, $\mu + \theta = 13$. Below: same
figure in log-log coordinates. Note the Pareto tail which decays as $t^{-2}$.

Tail behavior of $\tau$:

• If $\lambda < \mu + \theta$: writing $\mathcal{L}(s,x) = \alpha(s)^x$, where $\alpha(s)$ is the root smaller than 1 found above, and
expanding as $s \to 0$, Karamata's Tauberian theorem (Feller 1971, XIII.5) yields the tail behavior of $\sigma^a$
and $\sigma^b$; therefore the conditional law of the duration $\tau$ is regularly varying with tail index 2.

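• If the order flow is balanced, i.e. $\lambda = \mu + \theta$, then

$\mathcal{L}(s,x) = (\alpha(s))^x \sim_{s \to 0} 1 - x\sqrt{s/\lambda}$,

the law of $\sigma^a$ is regularly varying with tail index 1/2 and

$P[\sigma^a > t \mid q^a_0 = x] \sim_{t \to \infty} \frac{x}{\sqrt{\pi\lambda}} \frac{1}{\sqrt{t}}$.

The duration then follows a heavy-tailed distribution with infinite first moment:

$P[\tau > t \mid q^a_0 = x, q^b_0 = y] \sim_{t \to \infty} \frac{xy}{\pi\lambda t}$ (6)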

The expression given in (3) is easily computed by discretizing the integral in (4). Plotting (3) for
a fine grid of values of t typically takes less than a second on a laptop. Figure 3 gives a numerical
example, with $\lambda = 12$ sec$^{-1}$, $\mu + \theta = 13$ sec$^{-1}$, $q^a_0 = 4$, $q^b_0 = 5$ (queue sizes are given in multiples of
average batch size).
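
The discretization mentioned above can be sketched as follows. This is one possible implementation under stated assumptions, not the authors' code: the names `hitting_density`, `psi`, `survival` and `psi_uniformization` are hypothetical, `nu` stands for $\mu + \theta$, the Bessel function is evaluated through the integral representation $I_n(z) = \frac{1}{\pi}\int_0^\pi e^{z\cos\varphi}\cos(n\varphi)\,d\varphi$, and $\psi_n(t)$ is computed as one minus the integral of the integrand of (4) over $[0,t]$, which assumes $\lambda \leq \mu + \theta$ so that the queue empties almost surely. The last function is an independent cross-check that conditions on the Poisson number of events in the queue.

```python
import math

def hitting_density(n, u, lam, nu, m=200):
    """Integrand of (4): density at time u of the first emptying of a queue of size n."""
    c = math.sqrt(lam * nu)
    acc = 0.0
    for k in range(m + 1):                      # trapezoidal rule in phi on [0, pi]
        phi = math.pi * k / m
        weight = 0.5 if k in (0, m) else 1.0
        acc += weight * math.exp(-u * (lam + nu - 2.0 * c * math.cos(phi))) * math.cos(n * phi)
    scaled_bessel = acc / m                     # equals exp(-u (lam + nu)) * I_n(2 c u)
    return (n / u) * (nu / lam) ** (n / 2.0) * scaled_bessel

def psi(n, t, lam, nu, steps=1000):
    """psi_n(t) = P[queue of size n not yet empty at time t], assuming lam <= nu."""
    if t <= 0.0:
        return 1.0
    h = t / steps                               # midpoint rule avoids u = 0
    mass = h * sum(hitting_density(n, (j + 0.5) * h, lam, nu) for j in range(steps))
    return 1.0 - mass

def survival(a, b, t, lam, nu):
    """Formula (3): P[tau > t | q^a_0 = a, q^b_0 = b]."""
    return psi(a, t, lam, nu) * psi(b, t, lam, nu)

def psi_uniformization(n, t, lam, nu, kmax=400):
    """Cross-check of psi; kmax should be much larger than (lam + nu) * t."""
    rate, p_up = lam + nu, lam / (lam + nu)
    alive = {n: 1.0}                            # queue-size law over paths that have not hit 0
    weight = math.exp(-rate * t)                # Poisson probability of k = 0 events
    total = weight
    for k in range(1, kmax + 1):
        nxt = {}
        for size, prob in alive.items():
            nxt[size + 1] = nxt.get(size + 1, 0.0) + prob * p_up
            if size > 1:                        # a departure from size 1 empties the queue
                nxt[size - 1] = nxt.get(size - 1, 0.0) + prob * (1.0 - p_up)
        alive = nxt
        weight *= rate * t / k
        total += weight * sum(alive.values())
    return total
```

For instance, `survival(4, 5, 1.0, 12.0, 13.0)` evaluates (3) at t = 1 second for the parameters of Figure 3; evaluating it on a grid of t values reproduces a curve of the kind shown there.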

3.2. Probability of upward move in the price for a balanced limit order book 

Assume now that $\lambda = \mu + \theta$, i.e. that the flow of limit orders is balanced by the flow of market
orders and cancellations. Therefore for all $t < \tau$, $q_t = M_{N_{2\lambda t}}$, where $(M_n, n \geq 0)$ is a symmetric
random walk on $\mathbb{Z}^2$ killed when it hits either the x-axis or the y-axis and $(N_{2\lambda t}, t \geq 0)$ is a Poisson
process with parameter $2\lambda$. Hence the probability of an upward move in the price starting from
a configuration $q^b_t = n$, $q^a_t = p$ for the order book is equal to the probability that the random walk
M starting from (n,p) hits the x-axis before the y-axis. This probability is given by the following
proposition:

Proposition 2. For $(n,p) \in \mathbb{N}^2$, the probability $\phi(n,p)$ that the next price move is an increase,
conditioned on having n orders on the bid side and p orders on the ask side, is:

$\phi(n,p) = \frac{1}{\pi} \int_0^\pi \left(2 - \cos(t) - \sqrt{(2 - \cos(t))^2 - 1}\right)^p \frac{\sin(nt)\cos(t/2)}{\sin(t/2)}\,dt$. (7)

Proof. The generator of the bivariate random walk $(M_n, n \geq 1)$ is the discrete Laplacian so
$\phi(n,p) = P[\sigma^a < \sigma^b \mid q^b_{0-} = n, q^a_{0-} = p]$ satisfies, for all $n \geq 1$ and $p \geq 1$,

$4\phi(n,p) = \phi(n+1,p) + \phi(n-1,p) + \phi(n,p+1) + \phi(n,p-1)$, (8)

with the boundary conditions: $\phi(0,p) = 0$ for all $p \geq 1$ and $\phi(n,0) = 1$ for all $n \geq 1$. This problem
is known as the discrete Dirichlet problem; solutions of (8) are called discrete harmonic functions.
Lawler and Limic (2010, Ch. 8) show that for all $t \geq 0$, the functions

$f_t(x,y) = e^{x r(t)} \sin(yt)$, and $\tilde f_t(x,y) = e^{-x r(t)} \sin(yt)$ with $r(t) = \cosh^{-1}(2 - \cos t)$

are solutions of (8). In (Lawler and Limic 2010, Corollary 8.1.8) it is shown that the probability
that a simple random walk $(M_k, k \geq 1)$ starting at $(n,p) \in \mathbb{Z}_+ \times \mathbb{Z}_+$ reaches the axes at $(x,0)$ is

$\frac{2}{\pi} \int_0^\pi e^{-r(t)p} \sin(nt) \sin(tx)\,dt$,

therefore

$\phi(n,p) = \lim_{m \to \infty} \sum_{k=1}^{m} \frac{2}{\pi} \int_0^\pi e^{-r(t)p} \sin(nt) \sin(kt)\,dt$.

Since

$\sum_{k=1}^{m} \sin(kt) = \frac{\sin(mt/2)\,\sin((m+1)t/2)}{\sin(t/2)} = \frac{\cos(t/2) - \cos((m + 1/2)t)}{2\sin(t/2)}$,

using integration by parts we see that the second term leads to the integral

$\int_0^\pi g(t) \cos((m + 1/2)t)\,dt = -\frac{1}{m + 1/2} \int_0^\pi g'(t) \sin((m + 1/2)t)\,dt \to 0$ as $m \to \infty$, where $g(t) := \frac{e^{-r(t)p} \sin(nt)}{\sin(t/2)}$,

since $g'$ is bounded. So finally:

$\phi(n,p) = \frac{1}{\pi} \int_0^\pi e^{-r(t)p} \sin(tn) \frac{\cos(t/2)}{\sin(t/2)}\,dt$.

Noting that $e^{-r(t)} = 2 - \cos(t) - \sqrt{(2 - \cos(t))^2 - 1}$ we obtain the result.

Note that the conditional probabilities (7) are, in the case of a balanced order book, independent
of the parameters describing the order flow.

The expression (7) is easily computed numerically: Figure 4 displays the shape of the function $\phi$.
The comparison with the corresponding empirical transition frequencies for CitiGroup tick-by-tick
data on June 26, 2008 shows good agreement between the theoretical conditional probabilities and
their empirical counterparts.
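
A minimal numerical sketch of (7) is given below, assuming a simple midpoint rule (which avoids the endpoint $t = 0$, where the integrand is finite but written as a quotient). The function name `prob_up` and the number of quadrature steps are choices made here for illustration.

```python
import math

def prob_up(n, p, steps=20000):
    """phi(n, p) from (7): n orders at the bid, p orders at the ask, balanced order flow."""
    h = math.pi / steps
    acc = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        c = math.cos(t)
        decay = 2.0 - c - math.sqrt((2.0 - c) ** 2 - 1.0)   # e^{-r(t)}
        acc += decay ** p * math.sin(n * t) * math.cos(t / 2.0) / math.sin(t / 2.0)
    return acc * h / math.pi

# Tabulate phi on a small grid of queue sizes (rows: bid size n, columns: ask size p).
table = [[prob_up(n, p) for p in range(1, 6)] for n in range(1, 6)]
```

Two properties serve as sanity checks: by symmetry of the random walk, `prob_up(n, n)` should be close to 1/2 and `prob_up(n, p) + prob_up(p, n)` close to 1, and a larger bid queue relative to the ask queue should raise the probability of an upward move.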

3.3. Dynamics of the price 

The high-frequency dynamics of the price in this model is described by a piecewise constant, right
continuous process $(s_t, t \geq 0)$ whose jump times correspond to times when the order book process
$(q_t, t \geq 0)$ hits one of the axes. Denote by $(\tau_1, \tau_2, \ldots)$ the successive durations between price changes.
The number of price changes that occur during $[0,t]$ is given by

$N_t := \max\{ n \geq 0,\ \tau_1 + \ldots + \tau_n \leq t \}$.

At $t = \tau_i$, $s_{\tau_i} = s_{\tau_i-} + 1$ if the ask queue is depleted and $s_{\tau_i} = s_{\tau_i-} - 1$ if the bid queue is
depleted. $(X_1, X_2, \ldots, X_n, \ldots)$ are the successive moves in the price. Note that in general this is not a
sequence of independent random variables. We define for $n \geq 1$,

$Z_n = \sum_{i=1}^{n} X_i$

the value of the price after n changes. Hence, for all $t \geq 0$, $s_t = Z_{N_t}$.

Proposition 3. Let $p_{cont} = P[X_2 = \delta \mid X_1 = \delta] = P[X_2 = -\delta \mid X_1 = -\delta]$ be the probability of two
successive price moves in the same direction.

• $\forall k \geq 1$, $Cov(X_1, X_k) = (2p_{cont} - 1)^{k-1}$.

• Conditional on the current state of the limit order book, the distribution of the n-th subsequent
price change $X_n$ is:

$p_n(x,y) := P[X_n = \delta \mid q^b_0 = x, q^a_0 = y] = \frac{1 + (2p_{cont} - 1)^{n-1}(2p_1(x,y) - 1)}{2}$.
Figure 4 Above: Conditional probability of a price increase, as a function of the bid and ask queue size. Below:
comparison with transition frequencies for CitiGroup tick-by-tick data on June 26, 2008.



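Proof. Let, for $(x,y) \in \mathbb{N}^2$ and for all $n \geq 2$, $p_n(x,y)$ be the probability that $X_n = \delta$, conditioned
on $q^b_0 = x$ and $q^a_0 = y$. To simplify, we write $p_n$ for $p_n(x,y)$. $p_n$ is characterized by the following
recurrence relation:

$\begin{pmatrix} p_n \\ 1 - p_n \end{pmatrix} = \begin{pmatrix} p_{cont} & 1 - p_{cont} \\ 1 - p_{cont} & p_{cont} \end{pmatrix} \begin{pmatrix} p_{n-1} \\ 1 - p_{n-1} \end{pmatrix}$,

hence

$\begin{pmatrix} p_n \\ 1 - p_n \end{pmatrix} = \begin{pmatrix} p_{cont} & 1 - p_{cont} \\ 1 - p_{cont} & p_{cont} \end{pmatrix}^{n-1} \begin{pmatrix} p_1 \\ 1 - p_1 \end{pmatrix}$.

The eigenvalues of this matrix are 1 and $2p_{cont} - 1$:

$\begin{pmatrix} p_{cont} & 1 - p_{cont} \\ 1 - p_{cont} & p_{cont} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 2p_{cont} - 1 \end{pmatrix} \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & -1/2 \end{pmatrix}$.

Therefore

$p_n = \frac{1 + (2p_{cont} - 1)^{n-1}(2p_1 - 1)}{2}$.

Moreover for all $n \geq 2$,

$Cov(X_1, X_n) = p_1 p_n + (1 - p_n)(1 - p_1) - p_1(1 - p_n) - p_n(1 - p_1) = 1 + 4p_n p_1 - 2p_n - 2p_1 = (2p_1 - 1)(2p_n - 1)$.

Once the direction of the first move is given, $p_1 \in \{0, 1\}$, so that $(2p_1 - 1)^2 = 1$ and substituting the
expression of $p_n$ yields

$Cov(X_1, X_n) = (2p_{cont} - 1)^{n-1}$.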
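
The closed form for $p_n$ can be checked against a direct iteration of the two-state recurrence. The sketch below uses hypothetical function names and illustrative values of $p_1$ and $p_{cont}$; it is a numerical sanity check rather than part of the proof.

```python
def prob_up_nth(n, p1, p_cont):
    """Closed form: probability that the n-th price move is an increase."""
    return 0.5 * (1.0 + (2.0 * p_cont - 1.0) ** (n - 1) * (2.0 * p1 - 1.0))

def prob_up_nth_recursive(n, p1, p_cont):
    """Same quantity, obtained by applying the recurrence n - 1 times."""
    p = p1
    for _ in range(n - 1):
        # An increase follows an increase with probability p_cont,
        # and follows a decrease with probability 1 - p_cont.
        p = p_cont * p + (1.0 - p_cont) * (1.0 - p)
    return p

# Illustrative values: p1 = 0.7 from the current book state, p_cont = 0.3.
checks = [(prob_up_nth(n, 0.7, 0.3), prob_up_nth_recursive(n, 0.7, 0.3)) for n in range(1, 9)]
```

With $p_{cont} < 1/2$ the factor $(2p_{cont} - 1)^{n-1}$ alternates in sign, so the conditional probability oscillates around 1/2 and converges to it geometrically as n grows.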

Remark 3 (Negative autocorrelation of price changes at first lag). It is empirically
observed that high frequency price movements have a negative autocorrelation at the first lag
(Cont (2001)). In our model $Cov(X_k, X_{k+1}) < 0$ if and only if $p_{cont} < 1/2$, which happens when

$\sum_{i=1}^{\infty} \sum_{j > i} f(i,j) > 1/2$

where $f$ is the joint distribution of queue sizes after a price increase. This condition is verified on
all high-frequency data sets we have examined. For example, for CitiGroup stock we find

$\sum_{i=1}^{\infty} \sum_{j > i} f(i,j) > 0.7$



This asymmetry condition on / corresponds to the fact that, after an upward price move, the new 
bid queue is generally smaller than the ask queue since the ask queue corresponds to the limit 
order previously sitting at second best ask level, while the bid queue results from the accumulation 
of orders over the very short period since the last price move. Under this condition, high frequency 
increments of the price are negatively correlated: an increase in the price is more likely to be 
followed by a decrease in the price. 

Remark 4. The sequence of price increments $(X_1, X_2, \ldots)$ is uncorrelated if and only if $p_{cont} = 1/2$
which happens when

$\sum_{i=1}^{\infty} \sum_{j > i} f(i,j) = 1/2$.
4. Diffusion limit of the price process



As discussed in Section 3.3, the high frequency dynamics of the price is described by a piecewise
constant stochastic process $s_t = Z_{N_t}$ where

$Z_n = X_1 + \ldots + X_n$ and $N_t = \sup\{k;\ \tau_1 + \ldots + \tau_k \leq t\}$

is the number of price moves during $[0,t]$.

However, over time scales much larger than the interval between individual order book events,
prices are observed to have diffusive dynamics and modeled as such. To establish the link between
the high frequency dynamics and the diffusive behavior at longer time scales, we shall consider a
time scale $t_n = t\zeta(n)$ over which the average number of order book events is of order n and exhibit
conditions under which the rescaled price process

$\left( \frac{s_{t_n}}{\sqrt{n}},\ t \geq 0 \right)_{n \geq 1}$

verifies a functional central limit theorem i.e. converges in distribution to a non-degenerate process
$(p_t, t \geq 0)$ as $n \to \infty$. The choice of the time scale $t_n = t\zeta(n)$ cannot be arbitrary: it is imposed by
the distributional properties of the durations which, as observed in Section 3.1, are heavy tailed.
More precisely, $\zeta(n)$ is chosen such that

$\frac{\tau_1 + \ldots + \tau_n}{\zeta(n)}$

has a well-defined limit. In this section, we show that, under a symmetry condition, this limit can
be identified as a diffusion process whose diffusion coefficient may be computed from the statistics
of the order flow driving the limit order book.

Assume $\lambda \leq \theta + \mu$ and that the joint distribution $f$ of the queue sizes after a price move satisfies:

$D(f) = \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} ij\,f(i,j) < \infty$ (9)

The quantity $D(f)$ represents a measure of market depth: more precisely, $\sqrt{D(f)}$ is the geometric
average of the size of the bid queue and the size of the ask queue after a price change.

In this section we assume that the distribution $f$ is symmetric with respect to its arguments:
$\forall i,j \geq 0$, $f(i,j) = f(j,i)$. Under this assumption, the sequence of increments $(X_i, i \geq 1)$ of the
price is a sequence of independent random variables. We will show that the limit p is then a
diffusion process which describes the dynamics of the price at lower frequencies. In particular, we
will compute the volatility of this diffusion limit p and relate it to the properties of the order flow.

In the following $\mathcal{D}$ denotes the space of right continuous paths on $[0,\infty)$ with left limits,
equipped with the Skorokhod topology $J_1$, and $\Rightarrow$ will designate weak convergence on $(\mathcal{D}, J_1)$ (see
Billingsley (1968), Whitt (2002) for a discussion).



4.1. Balanced order book 

We first consider the case of a balanced order flow for which the intensity of market orders and 
cancelations is equal to the intensity of limit orders. The study of high-frequency quote data 
indicates that this is an empirically relevant case for many liquid stocks. 

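Theorem 1 If $\lambda = \mu + \theta$,

$\left( \frac{s_{n \log n\, t}}{\sqrt{n}},\ t \geq 0 \right) \Rightarrow \left( \delta \sqrt{\frac{\pi\lambda}{D(f)}}\, W_t,\ t \geq 0 \right)$ as $n \to \infty$,

where $\delta$ is the tick size, $D(f)$ is given by (9) and W is a standard Brownian motion.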
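Proof. For all $t \geq 0$ and $n \geq 1$, let $t_n = n \log n\, t$ and write

$\frac{s_{n \log n\, t}}{\sqrt{n}} = \frac{Z(nt\pi\lambda/D(f))\,\delta}{\sqrt{n}} + \left( \frac{Z(N_{t_n})\,\delta}{\sqrt{n}} - \frac{Z(nt\pi\lambda/D(f))\,\delta}{\sqrt{n}} \right)$. (10)

Using Donsker's invariance principle, the sequence of processes $\left( \frac{Z(nt\pi\lambda/D(f))\,\delta}{\sqrt{n}},\ t \geq 0 \right)$ converges
in $(\mathcal{D}, J_1)$ to a Brownian motion with volatility $\delta\sqrt{\pi\lambda/D(f)}$. Let $\rho : (1,\infty) \to (1,\infty)$ be a function
satisfying:

$\rho(t) \log(\rho(t)) = t$.

Since $\rho(t) \sim_{t \to \infty} t/\log(t)$, the tail estimate (6), averaged over the distribution $f$ of the queue sizes
after a price move, implies that for all $t \geq 0$, $N_{t_n}/n$ converges in probability to $t\pi\lambda/D(f)$ as $n \to \infty$.

Therefore the finite dimensional distributions of the sequence of processes

$\left( \frac{Z(N_{t_n})\,\delta}{\sqrt{n}} - \frac{Z(nt\pi\lambda/D(f))\,\delta}{\sqrt{n}} \right)_{t \geq 0}$

converge to a point mass at zero. Since this sequence of processes is tight on $(\mathcal{D}, J_1)$, it converges
weakly to zero on $(\mathcal{D}, J_1)$ (see Whitt (2002)). Finally,

$\left( \frac{s_{t_n}}{\sqrt{n}},\ t \geq 0 \right) \Rightarrow \left( \delta \sqrt{\frac{\pi\lambda}{D(f)}}\, W_t,\ t \geq 0 \right)$ as $n \to \infty$.
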

4.2. Empirical test using high-frequency data

Theorem 1 relates the 'coarse-grained' volatility of intraday returns at lower frequencies to the
high-frequency arrival rates of orders. Denote by $\tau_0 = 1/\lambda$ the typical time scale separating order
book events. Typically $\tau_0$ is of the order of milliseconds. In plain terms, Theorem 1 states that,
observed over a time scale $\tau_2 \gg \tau_0$ (say, 10 minutes), the price has a diffusive behavior with a
diffusion coefficient $\sigma_n$ given by

$\sigma_n^2 = \delta^2\, \frac{n\pi\lambda}{D(f)}$ (13)

where $\delta$ is the tick size, n is an integer verifying $n \ln n\, \tau_0 = \tau_2$ which represents the average number
of orders during an interval $\tau_2$, and $\sqrt{D(f)}$, the geometric average of the size of the bid queue and
the size of the ask queue after a price change, is a measure of market depth.



Formula ( 13 ) links properties of the price to the properties of the order flow, the left hand side 
represents the variance of price changes, whereas the right hand side only involves the tick size and 
quantities: it yields an estimator for price volatility which may be computed without observing the 
price! 

The relation (13) has an intuitive interpretation. It shows that, in two 'balanced' limit order
markets with the same tick size and same rate of arrival of orders at the best bid/ask, the market
with higher depth of the next-to-best queues will lead to lower price volatility.
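The comparison between two such markets can be made concrete. Assuming, as the text states, that the volatility scales like $\sqrt{\lambda/D(f)}$, two markets A and B with the same tick size and the same arrival rate have volatilities in the ratio $\sqrt{D_B/D_A}$. The sketch below only encodes this ratio; the function name and the depth values are hypothetical.

```python
import math


def volatility_ratio(depth_a: float, depth_b: float) -> float:
    """Return sigma_A / sigma_B = sqrt(D_B / D_A) for two markets that share
    the same tick size and the same order arrival rate."""
    if depth_a <= 0 or depth_b <= 0:
        raise ValueError("depths must be positive")
    return math.sqrt(depth_b / depth_a)


# Hypothetical depths D(f): market A is four times deeper than market B.
print(volatility_ratio(depth_a=400.0, depth_b=100.0))  # 0.5
```

Under this assumption, a market that is four times deeper is half as volatile.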

More precisely, this formula shows that the microstructure of order flow affects price volatility
through the ratio $\lambda/D(f)$, where $\lambda$ is the rate of arrival of limit orders and $D(f)$, given by (9), is a
measure of market depth: in fact, our model predicts a proportionality between the variance of price
increments and this ratio. This is an empirically testable prediction: Figure 5 compares, for stocks
in the Dow Jones index, the standard deviation of 10-minute price increments with $\sqrt{\lambda/D(f)}$.

We observe that, indeed, stocks with a higher value of the ratio $\lambda/D(f)$ have a higher variance,
and the standard deviation of price increments increases roughly proportionally to $\sqrt{\lambda/D(f)}$.
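A cross-sectional check of this kind can be sketched as an ordinary least-squares fit of the standard deviation of price increments against $\sqrt{\lambda/D(f)}$. The code below is a minimal sketch only: the helper names and the three (arrival rate, depth, standard deviation) triples are made up for illustration and are not the Dow Jones data behind Figure 5.

```python
import math


def depth_adjusted_rate(lam: float, depth: float) -> float:
    """Explanatory variable sqrt(lambda / D(f)) for one stock."""
    return math.sqrt(lam / depth)


def least_squares_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical per-stock inputs: (arrival rate, depth D(f), std of increments).
stocks = [(12.0, 90.0, 0.031), (20.0, 60.0, 0.052), (8.0, 150.0, 0.020)]
xs = [depth_adjusted_rate(lam, depth) for lam, depth, _ in stocks]
ys = [std for _, _, std in stocks]
slope, intercept = least_squares_line(xs, ys)
print(slope, intercept)
```

A fitted intercept close to zero relative to the typical standard deviation would be consistent with the proportionality predicted by the model.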




Figure 5 $\sqrt{\lambda/D(f)}$, estimated from tick-by-tick order flow (vertical axis) vs standard deviation of 10-minute price
increments (horizontal axis) for stocks in the Dow Jones Index, estimated from high frequency data on
June 26, 2008. Each point represents one stock. Red line indicates the best linear approximation.



4.3. Case when market orders and cancelations dominate 

We now consider the case in which the flow of market orders and cancellations dominates that of
limit orders: $\lambda < \theta + \mu$. In this case, price changes are more frequent since the order queues are
depleted at a faster rate than they are replenished by limit orders. We also obtain a diffusion
limit, though with a different scaling:

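Theorem 2 Let $\lambda < \theta + \mu$ and $f$ a probability distribution on $\mathbb{N}^2$ which satisfies
$$m(\lambda, \theta+\mu, f) = \sum_{i=1}^{\infty}\sum_{j=1}^{\infty} m(\lambda, \theta+\mu, i, j)\, f(i,j) < \infty,$$
where for all $(x,y) \in (\mathbb{N}^*)^2$,
$$m(\lambda, \theta+\mu, x, y) = \int_0^\infty dt \int_t^\infty \psi_{x,\lambda,\theta+\mu}(u)\,du \int_t^\infty \psi_{y,\lambda,\theta+\mu}(u)\,du,$$
where $\psi_{x,\lambda,\theta+\mu}$ is given by (4). Then
$$\left(\frac{s_{nt}}{\sqrt{n}},\ t \ge 0\right) \underset{n\to\infty}{\Rightarrow} \left(\frac{\delta}{\sqrt{m(\lambda,\theta+\mu,f)}}\,W_t,\ t \ge 0\right),$$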

where $W$ is a standard Brownian motion.
Proof. The sequence $(\tau_2, \tau_3, \ldots)$ is a sequence of i.i.d. random variables with finite mean equal
to $m(\lambda, \theta+\mu, f)$. We apply the law of large numbers:
$$\frac{\tau_1 + \tau_2 + \cdots + \tau_n}{n} \underset{n\to\infty}{\longrightarrow} m(\lambda, \theta+\mu, f).$$
Therefore,
$$\frac{N_{nt}}{n} \underset{n\to\infty}{\longrightarrow} \frac{t}{m(\lambda, \theta+\mu, f)}.$$
The rest of the proof follows the lines of the proof of Theorem 1.

Variance of price change at intermediate frequency. Similarly to Theorem 1, Theorem 2 leads to
an expression of the variance of the price at a time scale $\tau \gg \tau_0$, where $\tau_0$ (~ ms) is the average
interval between order book events:

$$\sigma^2 = \frac{\tau}{\tau_0\, m(\lambda, \theta+\mu, f)}\,\delta^2.$$

Here, $m(\lambda, \theta+\mu, f)$ represents the expected hitting time of the axes by the queueing system with
parameters $(\lambda, \theta+\mu)$ and random initial condition with distribution $f$ in the positive orthant.
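The interpretation of $m(\lambda, \theta+\mu, x, y)$ as an expected hitting time suggests a simple Monte Carlo check. The sketch below assumes, as in the Markovian model of this paper, that the bid and ask queues are independent, each increasing by one at rate $\lambda$ and decreasing by one at rate $\theta+\mu$; it simulates the time until either queue empties starting from $(x, y)$. The function names, the parameter values and the number of paths are illustrative assumptions.

```python
import random


def hitting_time(x: int, y: int, lam: float, death: float, rng: random.Random) -> float:
    """Time until one of two independent queues empties.
    Each queue gains one order at rate lam and loses one at rate death
    (death plays the role of theta + mu)."""
    t = 0.0
    total = 2.0 * (lam + death)  # total event rate while both queues are non-empty
    while x > 0 and y > 0:
        t += rng.expovariate(total)
        u = rng.random() * total
        if u < lam:
            x += 1
        elif u < lam + death:
            x -= 1
        elif u < 2.0 * lam + death:
            y += 1
        else:
            y -= 1
    return t


def estimate_m(x: int, y: int, lam: float, death: float,
               n_paths: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected hitting time m(lam, death, x, y)."""
    if lam >= death:
        raise ValueError("this sketch requires lam < theta + mu")
    rng = random.Random(seed)
    return sum(hitting_time(x, y, lam, death, rng) for _ in range(n_paths)) / n_paths


# Hypothetical parameters: lambda = 1, theta + mu = 2, both queues start at 5.
print(estimate_m(5, 5, lam=1.0, death=2.0))
```

As a sanity bound, a single queue started at $x$ empties after an expected time $x/(\theta+\mu-\lambda)$, so the estimate for two queues should lie below that value. Averaging such estimates over initial conditions drawn from $f$ would approximate $m(\lambda, \theta+\mu, f)$.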

As before, while the left-hand side of this equation is the variance of price changes (over a time
scale $\tau$), the right-hand side only involves the tick size and quantities which relate to the statistical
properties of the order flow.

4.4. Conclusion 

We have exhibited a simple model of a limit order market in which order book events are described
in terms of a Markovian queueing system. The analytical tractability of our model allows us to compute
various quantities of interest, such as

• the distribution of the duration until the next price change,

• the distribution of price changes, and

• the diffusion limit of the price process and its volatility,

in terms of parameters describing the order flow. These results provide some insight into the relation
between price dynamics and order flow in a limit order market.

We view this stylized model as a first step in the elaboration of the analytical study of realistic
stochastic models of order book dynamics. Yet, comparison with empirical data shows that even
our simple modeling set-up is capable of yielding useful analytical insights into the relation between
volatility and order flow, worthy of being further pursued. Moreover, the connection with
two-dimensional queueing systems allows us to use the rich analytical theory developed for these systems
(see Cohen and Boxma (1983)) to compute many other quantities. We hope to pursue further some
of these ramifications in future work.

A relevant question is to examine which of the above results are robust to departures from the
model assumptions and whether the intuitions conveyed by our model remain valid in a more
general context where one or more of these assumptions are dropped. This issue is further studied
in a companion paper, Cont and de Larrard (2010), where we explore a more general dynamic model
relaxing some of the assumptions above.